Abstract

We present an extensive, annotated bibliography of the abstract machines designed for each of the main programming 
paradigms (imperative, object oriented, functional, logic and concurrent). We conclude that whilst a large number of efficient 
abstract machines have been designed for particular language implementations, relatively little work has been done to design 
abstract machines in a systematic fashion. © 2000 Elsevier Science B.V. All rights reserved. 

1. What is an abstract machine?

Abstract machines are machines because they per- 
mit step-by-step execution of programs; they are 
abstract because they omit the many details of real 
(hardware) machines. 

Abstract machines provide an intermediate lan- 
guage stage for compilation. They bridge the gap 
between the high level of a programming language 
and the low level of a real machine. The instructions 
of an abstract machine are tailored to the particu- 
lar operations required to implement operations of a 
specific source language or class of source languages. 

Common to most abstract machines are a program 
store and a state, usually including a stack and regis- 
ters. The program is a sequence of instructions, with a 
special register (the program counter) pointing at the 
next instruction to be executed. The program counter
is advanced when the instruction is finished. This ba- 
sic control mechanism of an abstract machine is also 
known as its execution loop. 
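
The execution loop described above can be sketched in a few lines of Python. This is our own minimal illustration, not the design of any machine cited here: the opcode names (PUSH, ADD, MUL, JUMP_IF_ZERO, HALT) and the function name run are assumptions made for the example.

```python
def run(program):
    """Run a list of (opcode, argument) pairs; return the final stack."""
    stack = []   # machine state: an evaluation stack
    pc = 0       # program counter: index of the next instruction
    while pc < len(program):            # the execution loop
        opcode, arg = program[pc]
        pc += 1                         # default: continue with the next instruction
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == "JUMP_IF_ZERO":  # a jump overrides the default
            if stack.pop() == 0:
                pc = arg
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


# (2 + 3) * 4, written in the order a compiler would emit it
example = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("HALT", None)]
print(run(example))  # prints [20]
```

The program store is the list, the state is the stack plus the program counter, and every instruction either falls through to its successor or sets the program counter explicitly.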

1.1. Alternative characterizations 

The above characterization fits many abstract ma- 
chines, but some abstract machines are more abstract 
than others. The extremes of this spectrum are char- 
acterized as follows: 

• An abstract machine is an intermediate language 
with a small-step operational semantics [107]. 

• An abstract machine is a design for a real machine 
yet to be built. 

1.2. Related terms 

The term abstract machine is sometimes also used
for different concepts, and other terms are used for
the concept of abstract machines. For example, some
authors use the terms emulator or interpreter, and some
use the term virtual machine, for implementations of
abstract machines, much as we use the term program
for implementations of an algorithm. Sun calls its
abstract machine for Java the Java Virtual Machine
[86,91]. The term virtual machine is also widely used
for the different layers of abstraction in operating
systems [121]; in IBM's VM operating system, virtual
machines are execution environments for running
several versions of the same operating system on the
same machine. In theoretical computer science the
term abstract machine is sometimes used for models 
of computation including finite state machines, Mealy 
machines, push down automata and Turing machines 
[61]. 

1.3. What are abstract machines used for? 

In the above characterization of abstract machines,
their use as an intermediate language for compilation
is an essential feature. As a result, the implementation
of a programming language consists of two
stages: the implementation of the compiler and the
implementation of the abstract machine. This is a
typical divide-and-conquer approach. From a pedagogical
point of view, this simplifies the presentation and
teaching of the principles of programming language
implementation. From a software engineering point
of view, the introduction of layers of abstraction
increases maintainability and portability, and it allows
for design-by-contract. Abstract machines have been
successful for the design of implementations of
languages that do not fit the "von Neumann computer"
well. As a consequence, most abstract machines are for
exotic or novel languages. There are only a few abstract
machines for languages like C or Fortran. Recently,
abstract machines have been used for mobile code in
heterogeneous networks such as the Internet.

In addition to all their practical advantages, abstract
machines are theoretically appealing as they make it
easier to prove the correctness of code generation, program
analyses and transformations [20,111].

2. Where do abstract machines come from? 

Abstract machines are often designed in an ad-hoc
manner based on experience with other abstract
machines or implementations of interpreters or compilers
for the same source language. However, some systematic
approaches have also been investigated. Wand was one
of the first to deal with the question of deriving
abstract machines from the semantics of a language. In
1982, he proposed an approach based on combinators
[130]. Finding suitable combinators was not automated
and was a difficult task, which was simplified in a later
paper [131]. The CAM (1985) was derived in a similar
way [34]. Another approach is based on partial eval- 
uation of interpreters with given example programs 
and folding of recurring patterns in the intermediate 
code [44,80,98]. Finally there are approaches based 
on pass separation [45,56,70,89,116]. Pass separation 
is a transformation which splits interpreters into com- 
piling and executing parts, the latter being the abstract 
machine. It has also been used in the 2BIG system 
(1996) to automatically generate abstract machines 
from programming language specifications [43,46]. 

3. Abstract machines for imperative 
programming languages 

Discussions in the late fifties within the ACM and 
other related bodies resulted in various proposals being 
made for an UNCOL: A UNiversal Computer Oriented 
Language. Various UNCOLs have been proposed. 
Conway's machine [33] for example was a register 
machine, with two instructions. Steel's machine [119] 
had sophisticated addressing modes. The principle of
an UNCOL is sound, but they have not been much 
used. We believe that this is mainly because of the 
lack of performance of the generated code. Chow and 
Ganapathi [30] give an overview of abstract machines 
for imperative programming languages that were cur- 
rent in the mid-1980s. Some believe that the Java Vir- 
tual Machine [86] of the late 1990s might finally play 
the role of an UNCOL, but we think that performance 
will remain a concern in many areas of computing. 

We will now look at some successful abstract ma- 
chines, which were designed for rather more modest 
goals: 

• The Algol Object Code (1964) [109] is an abstract 
machine for Algol60. It has a stack, a heap and 
a program store. Its instructions provide mecha- 
nisms for variable and procedure scope, allocation 
of memory, access to variables and arrays, and 
call-by-value and call-by-name procedure calls. 

• The P4-machine (1976) is an abstract machine for
the execution of Pascal programs, developed by 
Wirth and colleagues [7]. The compiler from Pascal 
to P4 and the abstract machine code are documented 
in [102]. The P4 machine has fixed-length instruc- 
tions. It implements block structure by a stack of ac- 
tivation records (frames), using dynamic and static 
links to implement recursion and static scoping, re- 
spectively. 

• The UCSD P-machine [32] is an abstract ma- 
chine for the execution of Pascal programs, with 
variable-length instructions. The compact bytecode 
of the machine has special instructions for calling 
Pascal's nested procedures, for calling formal pro- 
cedures, for record and array indexing and index 
checks, for handling (Pascal) sets, for signalling 
and waiting on semaphores, etc. The P-machine 
was used in the popular UCSD Pascal system for 
microcomputers (ca. 1977). A commercial hard- 
ware implementation of the P-machine was made 
(see Section 11).

• Forth (1970) may be considered as a directly exe- 
cutable language of a stack-based abstract machine: 
expressions are written in postfix (reverse Polish no- 
tation), a subroutine simply names a code address, 
etc. [77,94]. 
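
To make the Forth item above concrete, the following Python sketch evaluates a tiny Forth-like language: numbers are pushed, operators act on the stack in postfix order, and a user-defined word simply names a piece of code. It is an illustration under our own assumptions (the function name forth_eval and the three primitives are chosen for the example) and covers only a sliver of real Forth.

```python
def forth_eval(source):
    """Evaluate a tiny Forth-like program and return the data stack."""
    stack = []
    words = {}  # user-defined subroutines: name -> list of tokens
    primitives = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
    }

    def execute(tokens):
        i = 0
        while i < len(tokens):
            token = tokens[i]
            if token == ":":                  # ": name body ;" defines a word
                end = tokens.index(";", i)
                words[tokens[i + 1]] = tokens[i + 2:end]
                i = end + 1
                continue
            if token in words:                # calling a word runs its code
                execute(words[token])
            elif token in primitives:
                primitives[token]()
            else:
                stack.append(int(token))      # anything else is a number
            i += 1

    execute(source.split())
    return stack


print(forth_eval(": square dup * ; 3 square 4 +"))  # prints [13]
```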

4. Abstract machines for object-oriented 
programming languages 

Abstract machines for object-oriented languages are 
typically stack-based and have special instructions for 
accessing the fields and methods of objects. Memory 
management is often implicit (done by a garbage col- 
lector) in these machines. 

• Smalltalk-80 (1980) is a dynamically typed 
class-based object-oriented language, implemented 
by compilation into a stack-based virtual machine 
code. The bytecode has instructions for stack ma- 
nipulation, for sending a message to an object (to 
access a field or invoke a method), for return, for 
jump, and so on [51] (the second edition [52] omits 
most of the material on the virtual machine). 

• Self (1989) is a dynamically typed class-less 
object-oriented language. Self has a particularly 
simple and elegant stack-based virtual machine 
code: every instruction has a three-bit instruction
op-code and a five-bit 'index', or instruction argument.
The eight instructions are: push self, push
literal, send message (to invoke a method or access 
a field), self send, super send, delegate (to a par- 
ent), return, and index extension. The bytecode is 
dynamically translated into efficient machine code 
[28,29]. 

• Java (1994) is a statically typed class-based 
object-oriented language, whose 'official' interme- 
diate language is the statically typed Java Virtual 
Machine (JVM) bytecode. The JVM has special 
support for dynamic loading and linking, with 
load-time verification (including type checking) of 
the bytecode. The instruction set supports object 
creation, field access, virtual method invocation, 
casting an object to a given class, and so on [86]. 
For hardware implementations of the JVM, see
Section 11.
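
The one-byte instruction format described for Self above (a three-bit op-code and a five-bit index) can be packed and unpacked as in the following Python sketch. The numeric values assigned to the op-codes, the placement of the op-code in the high bits, and the way an index extension contributes higher-order index bits to the next instruction are all assumptions made for illustration; the source only lists the eight instructions.

```python
# Order taken from the list in the text; the numbering itself is assumed.
OPCODES = ["PUSH_SELF", "PUSH_LITERAL", "SEND", "SELF_SEND",
           "SUPER_SEND", "DELEGATE", "RETURN", "INDEX_EXTENSION"]


def encode(opcode, index=0):
    """Pack an op-code (3 bits) and an index (5 bits) into one byte."""
    if not 0 <= index < 32:
        raise ValueError("index must fit in five bits")
    return (OPCODES.index(opcode) << 5) | index


def decode(byte):
    """Split one byte into (op-code name, five-bit index)."""
    return OPCODES[byte >> 5], byte & 0b11111


def decode_stream(code):
    """Decode a byte sequence, folding index extensions into the next index."""
    result, extension = [], 0
    for byte in code:
        opcode, index = decode(byte)
        index |= extension << 5          # assumed: extension supplies high bits
        if opcode == "INDEX_EXTENSION":
            extension = index
        else:
            result.append((opcode, index))
            extension = 0
    return result


code = bytes([encode("PUSH_SELF"),
              encode("INDEX_EXTENSION", 1), encode("PUSH_LITERAL", 2),
              encode("SEND", 3), encode("RETURN")])
print(decode_stream(code))
```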

5. Abstract machines for string processing 
languages 

A string processing language is a programming lan- 
guage that focuses on string processing rather than 
processing numeric data. String processing languages 
have been around for decades in the form of com- 
mand shells, programming tools, macro processors, 
and scripting languages. This latter category has become
prominent as scripting languages are used to
'glue' components together [101]. The components 
are typically written in a (systems) programming lan- 
guage, such as C, but they may be glued components 
themselves. 

String processing languages are either implemented 
by interpreting a proprietary representation of the 
source text, or the implementation is based on some 
low level abstract machine. There are two reasons for 
using a proper abstract machine: improved execution 
speed and better portability. Machine independence 
has become less of an issue in recent years, because the 
number of different computer architectures has fallen 
dramatically over time, and because C acts as a lingua 
franca to virtually every platform currently in use. 

We will discuss two prominent examples of early 
string processing languages, where an abstract ma- 
chine is used mainly to achieve machine indepen- 
dence. 

• Snobol4 [54] is a string processing language with
a powerful pattern matching facility. The language 
has been used widely to build compilers, symbolic 
algebra packages, etc. The Snobol4 abstract ma- 
chine (SIL) operates on data descriptors, which con- 
tain either scalar data or references, as well as the 
type of the data and some control information. The 
data representation makes it possible to treat strings 
as variables, and to offer data directed dispatch of 
operations, much in the same way as object oriented 
systems offer today. The machine operates a pair of
stacks and a garbage-collected heap (mark-scan). The
instruction set is designed firstly to provide efficient 
support for the most common operations and sec- 
ondly to ease the task of porting it [53]. 

• ML/I [23] is a macro processor. Macro processors 
are based on a substitution model, whereas ordinary 
string processors treat strings as data to which oper- 
ations are applied. Macro processors are generally 
more difficult to program than ordinary string pro- 
cessors. The ML/I macro processor is implemented 
via the LOWL abstract machine. This machine 
offers two stacks, three registers, non-recursive 
sub-routines and a small set of instructions. Porta- 
bility has been the major driver for the design. 

UNIX has had a profound influence on what we
consider scripting languages today. With UNIX came
the now classical tool-set comprising the shell, awk, 
and make. As far as we know, all of these are imple- 
mented using an internal representation close to the 
source text. Descendants of these tools are now ap- 
pearing that use abstract machines again, mainly for 
speed but also for machine independence: 

• Awk [1] constructs a parse tree from the source. The 
interpreter then traverses the parse tree, interpret- 
ing the nodes. Interior nodes correspond to an op- 
erator or control flow construct; leaves are usually 
pointers to data. Interpreter functions return cells 
that contain the computed results. Control flow in- 
terruptions like break, continue, and function return 
are handled specially by the main interpreter. 

• Nmake [49] is a version of the make tool for 
UNIX, which provides a more flexible style of de- 
pendency assertions. To be able to port these new 
make files to older systems, Nmake can translate 
its input into instructions for the Make Abstract 
Machine (MAM). These are easy to translate into 
more common Makefile formats [78]. 



• Tcl [100] is a command language designed to be
easily extensible with application-specific, compiled
commands. The most widely known application
of Tcl is the Tk library for building Graphical
User Interfaces. The flexibility of Tcl is achieved
primarily by representing all data as strings and
by using a simple and uniform interface to commands.
For example, the while construct of the
Tcl language is implemented by a C procedure
taking two strings as arguments. The first string
is the conditional expression and the second is the
statement to be executed. The C procedure calls
the Tcl command interpreter recursively to evaluate
the conditional and the statements ([100], p. 321).
The abstract machine does not have any stacks of
its own; it relies on the C implementation.

Since version 8.0 Tcl uses a bytecode interpreter
[74].

• Perl [128] is a scripting language, with an enor- 
mous collection of modules for a wide range of 
applications, such as building CGI scripts for Web 
servers. The implementation compiles Perl code 
into an intermediate, tree structured representation, 
with each instruction pointing to the next. The 
abstract machine has seven stacks which are ex- 
plicitly manipulated by the compiled instructions. 
There are six different data types, and over 300 
instructions. Reference counting is used to perform 
storage management [118]. 

• Python is an object oriented scripting language [87]. 
Python is implemented using a stack based abstract 
machine. The instructions are rather like method 
calls, dispatching on the type of the operands found 
on the stack. There are over 100 instructions, or- 
ganized as segments of code, with jumps to alter 
the flow of control. Python uses a reference count 
garbage collector. 

Hugunin [63] has created an implementation of 
JPython, which targets the Java Virtual Machine 
instead. 
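
The stack-based instructions of the Python abstract machine mentioned above can be inspected with the dis module of the standard library. The sketch below is only an illustration: the function add is a made-up example, and the instruction names and their number differ between CPython versions, so no particular listing should be expected.

```python
import dis


def add(a, b):
    return a + b


# Each instruction has an offset, a name and (possibly) an argument;
# operands are taken from, and results left on, the evaluation stack.
for instruction in dis.get_instructions(add):
    print(instruction.offset, instruction.opname, instruction.argrepr)
```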

The performance of the scripting languages
above has been studied by a number of authors. Kernighan
and van Wyk [74] compare Awk, Perl, Tcl, Java,
Visual Basic, Limbo, C and Scheme. They show
that, depending on the benchmark and the platform,
C and Java sometimes do worse than the scripting
languages. Romer et al. [110] benchmark Java, Perl
and Tcl using a cache-level simulator of the MIPS
architecture. They conclude that even though scripting
languages perform less well than C, special hardware
support is not warranted.

6. Abstract machines for functional programming 
languages 

The first abstract machines for functional languages, 
such as the SECD [81] and FAM [26], defined strict 
evaluation, also known as eager or call-by-value eval- 
uation, in which function arguments are evaluated be- 
fore the call, and exactly once. More recently, most 
work has focused on lazy (or call-by-need) evalua- 
tion, in which function arguments are evaluated only 
if needed, and at most once. One reason is that effi- 
cient implementation of strict evaluation is now well 
understood, so that the need to go via an abstract ma- 
chine has diminished. 

Central concepts in abstract machines for functional 
languages include: 

• A stack in general represents the context of a nested 
computation. It may hold the intermediate results of 
pending computations, activation records of active 
function invocations, active exception handlers, etc. 
The stack is sometimes used also for storing argu- 
ments to be passed to functions. 

• An environment maps program variables to their 
values. 

• A closure is used to represent a function as a value. 
It typically consists of a code address (for the func- 
tion body) and an environment (binding the free 
variables of the function body). 

• A heap stores the data of the computation. Abstract 
machines usually abstract away from the details of 
memory management, and thus include instructions 
for allocating data structures in the heap, but not for 
freeing them; the heap is assumed to be unlimited. 

• A garbage collector supports the illusion that the 
heap is unlimited; it occasionally reclaims unreach- 
able heap space and makes it available for alloca- 
tion of new objects. 

6.1. Strict functional languages 

• The SECD machine (1964) was designed by Landin
for call-by-value evaluation of the pure lambda
calculus [81]. The machine derives its name from the
components of its state: an evaluation stack S, an
environment E, a control C holding the instructions
to execute, and a dump D holding a continuation
(i.e., a description of what must be done next).

• Cardelli's Functional Abstract Machine (1983) is 
a much extended and optimized SECD machine 
used in the first native-code implementation of ML 
[26,27]. 

• The Categorical Abstract Machine (1985) was de- 
veloped by Cousineau et al. [34]. Its instructions 
correspond to the constructions of a Cartesian 
closed category: identity, composition, abstraction, 
application, pairing, and selection. It was the base 
for the CAML implementation of ML. 

• The Zinc Abstract Machine (1990) developed by 
Leroy [82] permits more efficient execution. It is an 
optimized, strict version of the Krivine machine (see 
Section 6.2 below). This machine is the basis of the 
bytecode versions of Leroy's Caml Light [35,135]
and Objective Caml implementations, and is used 
also in Moscow ML [117]. 
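
The four components of the SECD machine, and the closures and environments introduced at the start of this section, can be seen at work in the following Python sketch. It is a simplified reading of the machine under our own assumptions: the term encoding (tuples tagged "var", "lam" and "app"), the marker "AP", the choice to evaluate the operand before the operator, and the use of Python callables as built-in functions are not taken from [81].

```python
def secd(term, env=None):
    """Evaluate a lambda term by call-by-value on a small SECD machine."""
    # S: evaluation stack, E: environment, C: control, D: dump
    S, E, C, D = [], dict(env or {}), [term], []
    while True:
        if not C:                       # control exhausted: return a result
            result = S.pop()
            if not D:
                return result
            S, E, C = D.pop()           # resume the suspended caller
            S.append(result)
            continue
        x = C.pop()
        if x == "AP":                   # apply function to argument
            function, argument = S.pop(), S.pop()
            if callable(function):      # built-in function (assumption)
                S.append(function(argument))
            else:
                _, param, body, closure_env = function
                D.append((S, E, C))     # save the continuation on the dump
                S, E, C = [], {**closure_env, param: argument}, [body]
        elif isinstance(x, tuple) and x[0] == "var":
            S.append(E[x[1]])           # look the variable up in E
        elif isinstance(x, tuple) and x[0] == "lam":
            S.append(("closure", x[1], x[2], E))   # code plus environment
        elif isinstance(x, tuple) and x[0] == "app":
            C.extend(["AP", x[1], x[2]])  # operand first, then operator
        else:
            S.append(x)                 # a constant


# (\f. \x. f (f x)) succ 0
twice = ("lam", "f", ("lam", "x",
         ("app", ("var", "f"), ("app", ("var", "f"), ("var", "x")))))
program = ("app", ("app", twice, ("var", "succ")), 0)
print(secd(program, {"succ": lambda n: n + 1}))  # prints 2
```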

6.2. Lazy functional languages 

In a lazy language, function and constructor argu- 
ments are evaluated only if needed, and then at most 
once. Although this can be implemented by represent- 
ing an unevaluated argument by a 'thunk', a function 
that will evaluate the argument and replace itself with 
the result, efficiency calls for other approaches. An im- 
portant idea due to Wadsworth is to represent the pro- 
gram by a graph which is rewritten by evaluation. The 
evaluation (rewriting) of a shared subgraph will auto- 
matically benefit all expressions referring to it. How- 
ever, repeatedly searching a graph for subexpressions 
to rewrite is slow. 
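
The thunk mentioned above can be sketched as a small Python class: the suspended computation runs when the value is first demanded, and its result then takes its place. The class and method names are our own choices for illustration.

```python
class Thunk:
    """A suspended computation that is evaluated at most once."""

    def __init__(self, compute):
        self._compute = compute   # zero-argument function, not yet run
        self._evaluated = False
        self._value = None

    def force(self):
        if not self._evaluated:
            self._value = self._compute()
            self._compute = None  # the result replaces the computation
            self._evaluated = True
        return self._value


calls = []
argument = Thunk(lambda: calls.append("evaluated") or 6 * 7)
print(argument.force(), argument.force(), len(calls))  # prints 42 42 1
```

An argument that is never forced is never evaluated, and an argument that is forced several times is evaluated once; this is exactly the "only if needed, and then at most once" behaviour of lazy evaluation.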

Early implementations compiled the program to a 
fixed set of combinators (closed lambda terms all of 
whose abstractions are at the head); these may be 
thought of as graph rewriting rules [123]. Later it was 
shown to be beneficial to let the program under con- 
sideration guide the choice of combinators (so-called 
supercombinators) [62]. 
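
Reading combinators as rewriting rules can be illustrated with the following Python sketch, which reduces terms built from S, K and I to weak head normal form. It rewrites trees rather than shared graphs, so it shows the rules but not the sharing that makes graph reduction lazy; the term encoding (strings for atoms, pairs for application) and the name whnf are assumptions made for the example.

```python
def whnf(term):
    """Reduce an S/K/I term to weak head normal form.

    Atoms are strings; the pair (f, a) means "f applied to a".
    """
    args = []  # spine stack: the last element is the first argument
    while True:
        while isinstance(term, tuple):      # unwind the spine
            function, argument = term
            args.append(argument)
            term = function
        if term == "I" and len(args) >= 1:  # I x     -> x
            term = args.pop()
        elif term == "K" and len(args) >= 2:  # K x y   -> x
            x = args.pop()
            args.pop()
            term = x
        elif term == "S" and len(args) >= 3:  # S f g x -> f x (g x)
            f, g, x = args.pop(), args.pop(), args.pop()
            term = ((f, x), (g, x))
        else:
            break                           # no rule applies at the head
    for argument in reversed(args):         # rebuild what is left of the spine
        term = (term, argument)
    return term


# S K K behaves like the identity: S K K a -> K a (K a) -> a
print(whnf(((("S", "K"), "K"), "a")))  # prints a
```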

• In his seminal paper [123], David Turner describes
the SK-machine to support the implementation of
SASL. The compiler is based on the equivalence between
combinatory logic [113] and the λ-calculus
[40]. It generates code for what is essentially a
two-instruction machine. To make the machine more
efficient, Turner added further instructions, each with
a functionality that is provably equivalent to a number
of S and K combinators.

• The G-machine (1984) was designed by Augusts- 
son and Johnsson for lazy (call-by-need) evaluation 
of functional programs in supercombinator form 
[10,68,104]. Instead of interpreting supercombi- 
nators as rewrite rules, they were compiled into 
sequential code with special instructions for graph 
manipulation. The G-machine is the basis of the 
Lazy ML [11] and HBC Haskell [13] implementa- 
tions. 

• The Krivine machine (1985) is a simple abstract 
machine for call-by-name evaluation (i.e. without 
sharing of argument evaluation) of the pure lambda 
calculus [39]. It has just three instructions, cor- 
responding to the three constructs of the lambda 
calculus: variable access, abstraction, and appli- 
cation. A remarkable feature is that the argument 
stack is also the return stack (continuation). 

• The Three Instruction Machine TIM (1986) is a 
simple abstract machine for evaluation of super- 
combinators, developed by Fairbairn and Wray 
[48]. The basic call-by-name version of this ma- 
chine is quite similar to the Krivine machine. A 
lazy (call-by-need) version needs extra machinery 
to update shared function arguments; it is somewhat 
complicated to implement this efficiently [8]. 

• The Krivine machine can be made lazy just as the 
TIM [36,37,115]. Alternatively one may add an explicit
heap and a single new instruction for making
recursive let-bindings [116]. The resulting machine 
has been used in some theoretical studies, e.g. [112]. 

• The Spineless-Tagless G-machine (1989) was de- 
veloped by Peyton Jones as a refinement of the 
G-machine [105]. It is used in the Glasgow Haskell 
compiler [103]. 
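
The Krivine machine described in the list above is small enough to be written out in full. The following Python sketch is our own rendering under stated assumptions: terms use de Bruijn indices (("var", 0) is the innermost bound variable), and a "const" tag is added, which is not part of the pure lambda calculus, so that results are easy to observe.

```python
def krivine(term):
    """Call-by-name evaluation to weak head normal form.

    Returns the final (term, environment) pair.
    """
    env, stack = (), []   # env: tuple of closures; stack: pending arguments
    while True:
        tag = term[0]
        if tag == "app":                 # application: push the argument closure
            stack.append((term[2], env))
            term = term[1]
        elif tag == "lam" and stack:     # abstraction: bind the top argument
            env = (stack.pop(),) + env
            term = term[1]
        elif tag == "var":               # variable access: enter its closure
            term, env = env[term[1]]
        else:                            # constant, or lambda with no argument
            return term, env


a, b = ("const", "a"), ("const", "b")
first = ("lam", ("lam", ("var", 1)))                 # \x. \y. x
print(krivine(("app", ("app", first, a), b))[0])     # prints ('const', 'a')

# A diverging argument is harmless as long as it is never used.
delta = ("lam", ("app", ("var", 0), ("var", 0)))
omega = ("app", delta, delta)
print(krivine(("app", ("lam", a), omega))[0])        # prints ('const', 'a')
```

Note that the single stack holds the arguments still to be consumed, which is what allows it to serve as the continuation as well. Since arguments are re-entered on every variable access, there is no sharing; a lazy variant would have to update the closure after its first evaluation.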

There are many more abstract machines for func- 
tional languages than we can mention here. Typically 
they were developed for theoretical study, or during 
the work on some novel language or implementation 
technique. 

It is ultimately the performance that decides whether 
an abstract machine has been well designed. A com- 
prehensive overview of over 25 functional language 
implementations is provided in the Pseudoknot bench- 
mark [58]. 



7. Abstract machines for logic programming 
languages 

Logic programming languages are based on predi- 
cate calculus. The program is given as a finite set of 
inference rules. The execution of a logic program per- 
forms logical inferences. Prolog is the most promi- 
nent logic programming language. In Prolog the rules 
are in a standard form known as universally quantified 
'Horn clauses'. A goal statement is used to start the 
computation which tries to find a proof of this goal. 

Most research in compiling of Prolog programs is 
centered around the Warren Abstract Machine WAM 
(1983) which has become the de facto standard [133]. 
It offers special purpose instructions, which include 
unification instructions for various kinds of data and 
control flow instructions to implement backtracking. 
The original report by Warren [132] gives just the bare 
bones and there have been several efforts to present 
the WAM in a pedagogical way [2,50,138]. The WAM 
uses four memory areas: heap, stack, trail, and PDL. 

• The WAM allocates structures and variables on the 
heap. Garbage collection automatically reclaims 
heap space allocated by structures and variables 
which are no longer reachable from the program. 

• The stack contains choice points and environments. 
A choice point has entries for the address
of the previous choice point and the next alternative
clause (continuation pointer), and saves some of
the registers of the WAM. An environment
consists of the permanent variables in a clause.
Conceptually the stack can be divided into two stacks,
called the AND- and OR-stacks. The AND-stack
contains environments and the OR-stack contains 
choice points. 

• On the trail, the WAM keeps track of which bind- 
ings have to be retracted after a clause fails and be- 
fore an alternative clause can be tried, i.e. during 
backtracking. 

• Finally the push down list, PDL, contains pairs of 
nodes, which have to be considered next by the uni- 
fication algorithm. Unification matches the current 
goal with the head of a clause and binds variables 
in both the goal and the head. 
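
The interplay of unification, the push down list and the trail can be sketched in Python as follows. This is a strong simplification under our own assumptions, not WAM code: terms are Python objects rather than heap cells, every binding is trailed (the WAM trails only bindings that must be undone on backtracking), and the names Var, deref, unify and undo are chosen for the example.

```python
class Var:
    """A logic variable; ref is None while the variable is unbound."""

    def __init__(self, name):
        self.name = name
        self.ref = None


def deref(term):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(term, Var) and term.ref is not None:
        term = term.ref
    return term


def unify(a, b, trail):
    """Unify two terms; structures are tuples (functor, arg1, ..., argn)."""
    pdl = [(a, b)]                        # push down list of pairs to consider
    while pdl:
        x, y = pdl.pop()
        x, y = deref(x), deref(y)
        if x is y:
            continue
        if isinstance(x, Var):
            x.ref = y
            trail.append(x)               # remember the binding for undoing
        elif isinstance(y, Var):
            y.ref = x
            trail.append(y)
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            pdl.extend(zip(x[1:], y[1:]))  # same functor and arity: go deeper
        elif x != y:
            return False                  # clash: unification fails
    return True


def undo(trail, mark):
    """Backtrack: retract all bindings made since the trail had length mark."""
    while len(trail) > mark:
        trail.pop().ref = None


X, Y, Z = Var("X"), Var("Y"), Var("Z")
trail = []
print(unify(("f", X, "b"), ("f", "a", Y), trail), deref(X), deref(Y))

mark = len(trail)                         # what a choice point would record
print(unify(("p", "a", Z), ("p", "b", "c"), trail))  # fails after binding Z
undo(trail, mark)
print(Z.ref is None, deref(X))            # Z is unbound again, X is kept
```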

Research has focussed on the generation of optimized
WAM code, and as a result extensions and modifications
of the WAM have been proposed. Some of
the techniques that have been investigated are
indexing of clauses, environment trimming, register
allocation [66,88] and tabling of goals [108]. Data flow
analysis, in particular abstract interpretation, and tail
recursion optimizations have been the basis of efficient
implementations of Prolog [125,127].

8. Abstract machines for hybrid programming 
languages 

Programming language researchers try to combine 
the best of different language paradigms in hybrid pro- 
gramming languages. For functional logic languages 
abstract machines have been designed as extensions 
of abstract machines for functional languages [79,97] 
or as extensions of the WAM [21,57]. The WAM has 
also been the basis for abstract machines for constraint 
logic programming languages [16,65], the concurrent, 
constraint logic programming language OZ [90] and 
the concurrent, real time language Erlang [9]. 

9. Abstract machines for parallel programming 
languages 

As noted by Blair [18], parallel and distributed mod- 
els converge due to trends towards high-speed net- 
works, platform independence and micro-kernel based 
operating systems. Several such models are discussed 
in [73], most notably the Parallel Virtual Machine 
PVM (1990) which serves as an abstraction to pro- 
gram sets of heterogeneous computers as a single com- 
putational resource [15,120]. The Threaded Abstract 
Machine TAM (1993) [38] and a similar, but simpler 
abstract machine [3,47] have been proposed as gen- 
eral target architectures for multi-threading on highly 
parallel machines. 

Parallel and distributed architectures provide com- 
putation power which programming language imple- 
mentations on these systems try to exploit. 

Pure functional languages are referentially trans- 
parent, and parallel evaluation of e.g. the arguments 
of a function invocation would seem a promising 
idea. Indeed, several abstract machines have been 
suggested which implement parallel graph reduction, 
e.g. the ⟨ν,G⟩-machine [12], GRIP [106], GUM [122],
and DREAM [22]. An abstract machine for parallel
proof systems [67] is also based on parallel graph
reduction. A critical review of parallel implementations
of functional languages, in particular of lazy
languages, is given by Wilhelm et al. [137]. They 
observe that the exploitation of natural parallelism 
in functional programming languages has not been 
successful so far [137]. In general, giving the pro- 
grammer control over the parallelism in the language 
allows for better results, e.g. the PCKS-machine 
[96]. 

Parallelism naturally arises in logic programming:
several clauses for the same goal (OR-parallelism)
or all goals in a clause (AND-parallelism) can be tried in
parallel. AND-Parallel models have been proposed in 
[59,85], some of the OR-Parallel models are the SRI 
model [134], the Argonne model [24], the BC machine 
[4], the MUSE model [5,6] and the model proposed in 
[31]. Furthermore there have been attempts to combine 
AND and OR parallelism [55]. There have also been 
parallel abstract machines, which are totally different 
from the WAM. One of these is the PPAM [71], which 
is based on a dataflow model. 

Some of the important issues in implementing par- 
allel abstract machines for programming languages are 
static and dynamic scheduling [19,114], granularity of
tasks, distributed garbage collection [69] and code and 
thread migration [126,136]. 

10. Special-purpose abstract machines 

Abstract machines are not only used for translation 
of programming languages, but also as intermediate 
levels of abstraction for other purposes. Term rewriting 
[42] is a model of computation used in various areas 
of computer science, including symbolic computation, 
automated theorem proving and execution of algebraic 
specifications. Abstract machines for term rewriting 
systems include the abstract rewriting machine ARM 
[72], μARM [129] and TRAM [99].

Portability is the main reason for the success of 
DVI [76] and PostScript [64] as page-description lan- 
guages. DVI is a simple language without control-flow 
constructs, whereas PostScript is a full programming 
language in the tradition of Forth. Both are used as 
intermediate languages by text processing systems. 
They are either further compiled into the language of 
a certain printer or interpreted by the printer's built-in 
processor. 

In natural language parsing, abstract machines based
on the WAM have been investigated. In contrast to usual
Prolog programs, terms in unification grammars tend to
be large, and thus efficient unification instructions are
added to the WAM [17,139].

The Hypertext Abstract Machine (1988) is a server 
for a hypertext storage system [25]. The data struc- 
tures of the machine are graphs, contexts, nodes, links 
and attributes. The instructions of the machine initi- 
ate transactions with the server to access and modify 
these. 

11. Concrete abstract machines 

A computer processor (CPU) could be considered a 
concrete hardware realization of an abstract machine, 
namely the processor's design. While this view is 
rather extreme when applied to processors such as the 
x86, SPARC, MIPS, or HP-PA, it makes more sense 
when applied to unconventional, special-purpose pro- 
cessors or abstract machines. 

For many years (roughly, from the early 1970s to 
the late 1980s) it was believed that efficient imple- 
mentation of symbolic languages, such as functional 
and logic languages, would require special-purpose 
hardware. Several such hardware implementations 
were undertaken, and some resulted in commercial 
products. However, the rapid development of conven- 
tional computer hardware, and advances in compiler 
and program analysis technology, nullified the advan- 
tages of special-purpose hardware, even microcoded 
implementations. Special-purpose hardware was too 
expensive to build and could not compete with stock 
hardware. 

A tell-tale sign is that the conference series Func- 
tional Programming and Computer Architecture (ini- 
tiated in 1981) published few papers on concrete 
computer architecture, and when merging with the 
Lisp conference series in 1996, dropped 'Computer 
Architecture' from its title. 

Some examples of concrete hardware realizations 
of abstract machines are: 

• The Burroughs B5000 processor (1961) had hard- 
ware support for efficient stack manipulation, such 
as keeping the top few elements of the stack in 
special CPU registers. The goal was to support 
the implementation of Algol 60 and other block-structured
languages. Subsequently many machines
with hardware stack support have been developed 
(see [77]). 

• A Lisp machine project was initiated at MIT in 
1974 and led to the creation of the company Sym- 
bolics in 1980. Symbolics Lisp Machines had a 
special-purpose processor and memory, with sup- 
port for e.g. the run-time type tags required by 
Lisp. The entire operating system and development 
environment were written in Lisp. By 1985 the 
company had sold 1500 Lisp Machines; by 1996 it 
was bankrupt. 

• The Pascal Microengine Computer (1979) is a 
hardware implementation of the UCSD P-code abstract
machine [92] (see Section 3). Analogously
to the Lisp machines, the operating system and 
development environment were written entirely in 
Pascal. The machine was commercially available 
in the early 1980s. 

• ALICE (Applicative Language Idealized Comput- 
ing Engine) [41] by Darlington and Reeve was the 
first hardware implementation of a reduction ma- 
chine. It was built using 40 transputers connected 
by a multi-stage switching network. 

• Kieburtz and others (1985) designed a hardware 
implementation of Augustsson and Johnsson's 
G-machine (see Section 6.2) for graph reduction of 
lazy functional languages [75]. Simulations of the 
processor and memory management system were 
done, but the hardware was never built. 

• The Norma was created by the Burroughs company 
(1986) as a research processor for high speed graph 
reduction in functional programming languages 
(see e.g. [77]). 

• Scheme-81 is a chip implementing an evaluator 
(abstract machine) for executing Scheme [14]. 

• A number of special-purpose machines for Prolog 
execution have been developed, mostly based on the 
WAM and modifications thereof. Several machines 
were designed within the Japanese Fifth Genera- 
tion Project (1982-1992), and a total of around 800 
such machines were built; they were used mostly 
inside the project. Other hardware implementations 
include the KCM project (1988) at ECRC in Eu- 
rope, and the VLSI-BAM, an implementation of the 
Berkeley Abstract Machine, designed at Berkeley 
in the USA. For more information and references,
see [124], Section 3.2.

• The Transputer (1984) [84] is a special-purpose
microprocessor for the execution of Occam [83], a 
parallel programming language with synchronous 
communication, closely based on Hoare's theoreti- 
cal language CSP [60]. It has special hardware and 
instructions for creating networks of Transputers. 

• MuP21 (1993) is an extremely simple but fast mi- 
croprocessor for executing Forth programs. It has 
two stacks, 20-bit words, just 24 (five-bit) instruc- 
tions, and its implementation requires only 7000 
transistors [95]. 

• A series of Java microprocessors which directly 
execute Java Virtual Machine bytecode (see Sec- 
tion 4) and support also conventional imperative 
languages was announced by Sun Microsystems 
in 1996. Technical specifications for the microJava 
701 processor were available by early 1998 [93], 
but apparently the chip was not yet in volume 
production by early 1999. 



12. Conclusion 

For almost 40 years abstract machines have been 
used for programming language implementation. As 
new languages appear, so will abstract machines as 
tools to handle the complexity of implementing these 
languages. While abstract machines are a useful tool 
to bridge the gap between a high level language and a 
low level architecture, much work remains to be done 
to develop a theory of abstract machines. Such a theory 
is necessary to support the systematic development 
of abstract machines from language and architecture 
specifications. 